Abstract

We have investigated the mechanism and the evolutionary pathway of protein dimerization through 
analysis of experimental structures of dimers. We propose that the evolution of dimers may have multiple 
pathways, including (1) formation of a functional dimer directly without going through an ancestor 
monomer, (2) formation of a stable monomer as an intermediate followed by mutations of its surface 
residues, and (3) a domain swapping mechanism, replacing one segment in a monomer by an equivalent 
segment from an identical chain in the dimer. Some of the dimers which are governed by a domain 
swapping mechanism may have evolved at an earlier stage of evolution via the second mechanism. Here, 
we follow the theory that the kinetic pathway reflects the evolutionary pathway. We analyze the 
structure-kinetics-evolution relationship for a collection of symmetric homodimers classified into three 
groups: (1) 14 dimers, which were referred to as domain swapping dimers in the literature; (2) nine 
2-state dimers, which have no measurable intermediates in equilibrium denaturation; and (3) eight 3-state 
dimers, which have stable intermediates in equilibrium denaturation. The analysis consists of the following 
stages: (i) The dimer is divided into two structural units, which have twofold symmetry. Each unit 
contains a contiguous segment from one polypeptide chain of the dimer, and its complementary 
contiguous segment from the other chain. (ii) The division is repeated progressively, with different 
combinations of the two segments in each unit. (iii) The coefficient of compactness is calculated for the 
units in all divisions. The coefficients obtained for different cuttings of a dimer form a compactness 
profile. The profile probes the structural organization of the two chains in a dimer and the stability of the 
monomeric state. We describe the features of the compactness profiles in each of the three dimer groups. 
The profiles identify the swapping segments in domain swapping dimers, and can usually predict whether 
a dimer has domain swapping. The kinetics of dimerization indicates that some dimers which have been 
assigned in the literature as domain swapping cases, dimerize through the 2-state kinetics, rather than 
through swapping segments of preformed monomers. The compactness profiles indicate a wide spectrum 
in the kinetics of dimerization: dimers having no intermediate stable monomers; dimers having an 
intermediate with a stable monomer structure; and dimers having an intermediate with a stable structure 
in part of the monomer. These correspond to the multiple evolutionary pathways for dimer formation. 
The evolutionary mechanisms proposed here for dimers are applicable to other oligomers as well. 

Keywords: compactness; dimerization; domain swapping; intermediate; kinetics; oligomer evolution 



Article Contents 


Introduction 
Results 

Compactness profile 

Fig. 1. Schematic diagram showing the cutting method of a dimer 
Equation 1 

Table 1. Summary of dimer groups and types 

Fig. 2. Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t, for 
the domain swapping dimers 

Fig. 3. Structural examples of domain swapping dimers for Figure 2 

Fig. 4. Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t, for 
the 2-state dimers 

Fig. 5. Structural examples of 2-state dimers for Figure 4 

Fig. 6. Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t, for 
the 3-state dimers 

Fig. 7. Structural examples of 3-state dimers for Figure 6 

Table 2. Domain swapping dimers 

Table 3. 2-state protein dimers 

Table 5. 3-state protein dimers 

Properties of dimer interfaces 

Table 4. Dimer properties 

Equation 2 

Fig. 8. The relationship between DeltaMSA and p for the 2-state dimers, the 3-state dimers, and the 
domain swapping dimers marked by PDB codes 
Discussion 

Implication of the compactness profile 
Domain swapping 






Protein Science 7(3): 533-544. Protein dimerization 



http://www.prosci.uci.edu/ArticlesA/ol7/issue3/729 1/729 1 html 



Dimer kinetics 
Evolution of dimers 
Materials and methods 
Data set selection 

Calculation of the compactness profile 
Equation 3 
Equation 4 

Calculation of hydrogen bonds 
Acknowledgments 
References 



Introduction 

Oligomers are often the functional forms of proteins. It has been a challenging problem in protein science 
to understand the mechanism of oligomerization and the evolution of protein oligomers. The importance 
of this problem is related not only to the origin of oligomers, but also to the design of functional 
oligomers. A recent hypothesis proposed by Eisenberg and colleagues provides new insight into this old 
topic. The authors found an interesting phenomenon called three-dimensional domain swapping in a 
high resolution structure of diphtheria toxin (Bennett et al., 1994). They also extended the mechanism to 
other oligomers (Bennett et al., 1994; Bennett et al., 1995). In the structure of a domain swapping 
oligomer, one segment of a monomeric protein is replaced by the same segment from an identical chain. 
Eisenberg and co-workers further suggest that domain swapping is possibly the mechanism of the 
evolution of oligomerization in general. In this hypothesis, a monomer served as the pre-evolved form of 
an oligomer. The interactions between monomers in an oligomer have been pre-optimized within the 
monomer at the interface between the swapping segment and the rest of the monomer. Hence, the 
formation of an oligomer does not rely on chance association or on mutations of surface residues 
(Bennett et al., 1995). 

Despite the inspiring beauty of the hypothesis, it has been questioned by D'Alessio and colleagues 
(Piccoli et al., 1992 ; D'Alessio, 1995). These authors have argued that domain swapping may not be a 
general mechanism for oligomer evolution, in particular, not for the case of bovine seminal ribonuclease 
(BS-RNase). BS-RNase can form two types of dimeric conformations, one with swapped N-terminal 
segment, and the other without segment swapping. The two structures coexist, and the conformation 
with swapping occurs only after the non-swapped dimer is formed. Hence, the dimerization occurs 
independently of the ability of the monomers to swap their N-terminal tails with other monomers. 
D'Alessio et al. further assume that the kinetic pathway of the dimerization captures the evolutionary 
pathway of the dimer formation. They suggest that the dimer with segment swapping may not be an early, 
necessary station in the evolution of the dimerization. Rather, it may be an evolutionary product at a later 
stage with the versatile biological function of allosteric regulation (D'Alessio, 1995 ). 

Deriving the evolutionary pathway through the kinetic pathway paves the way for studies of the evolution 
of protein oligomers in general. D'Alessio et al. have discussed only the case of BS-RNase. The kinetic 
pathways of oligomerization are often complicated. Equilibrium denaturation experiments have shown 
that there are two types of equilibrium transitions (Neet & Timm, 1994 ) in dimerization. One type of 
denaturation is the 2-state, i.e., the native dimer state and the denatured monomer state. There is no 
stable intermediate in between. The other type of denaturation is the 3-state, i.e., the native dimer state, 
the stable monomer state, and the denatured monomer state. If the stable monomer state has the same 
structure as the one it possesses when in the dimer, the dimerization belongs to "rigid body," three-state 
binding. In other 3-state dimers, the structure of the intermediate state deviates from the one in the dimer. 
For example, aspartate aminotransferase has a molten globule monomer intermediate (Herold & 
Kirschner, 1990). Such a wide spectrum in the kinetics of oligomerization illustrates the necessity for an 
analysis of a collection of oligomers. 

The evolution-kinetics relationship can be studied from the kinetics-structure relationship. In the case of 
the monomers, the protein structures can often be divided into several compact units. Due to hydrophobic 
effects, compact units are generally more stable in the solvent than non-compact ones. In the folding 
kinetics, large monomers typically form several initial collapsed nuclei as compact units or domains, 
followed by an assembly of these stable entities (Wu et al., 1994 ; Tsai & Nussinov, 1997a ). 
Correspondingly, it has been proposed that multi-domain monomers may have evolved from proteins 
having single-domains via domain insertion (Russell, 1994 ). Some oligomers can also be divided into 
several compact units, while others intertwine their chains to form only a single compact unit in one 
oligomer. The organization of the chains of an oligomer in space reflects the kinetics of the 
oligomerization, and hence, is likely to show traces of the evolutionary journey of the oligomer. 

In this paper, we apply the structure-kinetics-evolution connection to explore the mechanism of 
oligomerization and the evolution of oligomers. For this purpose, we have carried out a computational 
analysis on three groups of dimers, i.e., 2-state dimers, 3-state dimers, and domain swapping dimers. All 
the dimers that we have analyzed are symmetric homodimers, which are the simplest oligomers. A 
symmetric homodimer consists of two identical peptide chains that are in twofold symmetry in the three 
dimensional structure. To probe the organization of the two chains, we divide the dimer into two 
symmetric units. Each unit contains a segment of one chain and the complementary segment of the other 
chain. Hence, the composition of each unit includes all the amino acids in the sequence of a chain, and the 
two units are symmetric in their structures. The shape of the units is assessed by their coefficients of 
compactness (Zehfus & Rose, 1986 ). We obtain a profile of the compactness for different divisions of the 
dimer. By describing the landscape of how intimately the two chains of the dimer integrate with each 
other structurally, the profile is capable of predicting domain swapping. Furthermore, it sheds light on the 
intermediate states between native dimers and unfolded monomers. Conclusions drawn from the study of 
dimers can be applied to oligomers in general. 

In the following, we present the compactness profiles and the properties of dimer interfaces, illustrating 
some structural details. We discuss the implications of our computational analysis for the evolution and 
the mechanism of protein oligomerization. Finally, we describe the data set and the methods which we 
have employed. 

Results 

In this section, we present the results of the compactness profiles, together with some structures, for 
domain swapping, 2-state, and 3-state dimers. We include the results of the calculations of their surface 
areas and interfacial hydrogen bonds. 

Compactness profile 

The principle of the calculation of the compactness profile of a dimer is illustrated in Figure 1 . 

Fig. 1 Schematic diagram showing the cutting method of a dimer. The chain with black balls and the one 
with white balls represent two identical peptide chains of the dimer, whose boundary is shown in the 
square boxes. A cutting divides the dimer into two units, each including a segment from one chain and the 
complementary segment from the other chain. The two units also have 2-fold spatial symmetry. The 
dashed lines show the cutting at position k (left) and at position k + 1 (right). 

A dimer consists of two identical protein chains whose structures are in twofold symmetry. Assume that 
each chain has residues 1, 2, 3, ..., N - 1, N. The dimer is cut into two identical units, with each unit 
containing two non-contiguous segments, as divided by the dashed lines in the figure. The left figure 
describes the cutting at position i = k. The newly formed unit is composed of k contiguous residues from 
one chain from its N-terminus (black balls in the figure), i.e., residues 1, 2, 3, ..., k - 1, k, and N - i 
contiguous residues from the other chain until its C-terminus (shown in white balls), i.e., residues k + 1, k 
+ 2, ..., N - 1, N. Such a cutting method reorganizes a dimer into two structurally symmetric units, each 
having the same composition as the original chain. We define the cutting ratio t as: 

$$t = \frac{i}{N} \qquad (1)$$

Next, we proceed to cut the dimer at i = k + 1 (shown in the right figure), and so on. The dimer is cut 
progressively at i = 0, 1, 2, ..., k, k + 1, ..., N - 1, N. The compactness profile is the coefficient of 
compactness (C) of the units as a function of the cutting ratio t, where 0 <= t <= 1. 
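
The progressive cutting can be sketched in Python. This is only an illustration under stated assumptions, not the program used in this work: the names `cut_units` and `compactness_profile` are hypothetical, residues are numbered 1 to N as in the text, and the coefficient of compactness is supplied by the caller as a function, since its calculation is described separately in Materials and methods.

```python
from typing import Callable, List, Tuple

# A unit is (residues taken from chain A, residues taken from chain B),
# using 1-based residue numbers as in the text. (Hypothetical representation.)
Unit = Tuple[List[int], List[int]]


def cut_units(n_residues: int, i: int) -> Tuple[Unit, Unit]:
    """Cut a symmetric homodimer at position i, where 0 <= i <= N."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    if not 0 <= i <= n_residues:
        raise ValueError("cutting position must satisfy 0 <= i <= N")
    head = list(range(1, i + 1))               # residues 1..i
    tail = list(range(i + 1, n_residues + 1))  # residues i+1..N
    # First unit: head of chain A plus tail of chain B.
    # Second unit: its symmetric partner (tail of A plus head of B).
    return (head, tail), (tail, head)


def compactness_profile(
    n_residues: int, compactness: Callable[[Unit], float]
) -> List[Tuple[float, float]]:
    """Return (t, C) pairs for i = 0..N, with cutting ratio t = i / N."""
    profile = []
    for i in range(n_residues + 1):
        unit, _symmetric_partner = cut_units(n_residues, i)
        profile.append((i / n_residues, compactness(unit)))
    return profile
```

Because the two units of a cut are related by the twofold symmetry, the sketch evaluates only one of them per cutting position.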

As t increases, the cutting position shifts from the N-terminus to the C-terminus. At t = 0 or t = 1, the 
value of C gives the coefficient of compactness for the whole chain. For a symmetric homodimer, C at t = 
0 should be identical to C at t = 1. However, there is a minor difference in some actual calculations due to 
the crystal packing of the dimer and the errors in the structure. Nevertheless, the difference is 
insignificant. In the following, we will classify each group of dimers (domain swapping, 2-state, or 3-state 
dimers) into several types according to the behavior of their compactness profiles, as summarized in 
Table 1. 

Table 1. Summary of dimer groups and types^a

Group | Type | Compactness profile | Structural feature | Example PDB codes
Domain swapping dimers (Table 3) (Figs. 2, 3) | (a) | Global min. around middle; local min. around both ends | A compact domain swaps to another compact domain | 2bb2, 1ddt
 | (b) | Global min. close to one side; local min./max. around ends | Compact domain with short uncompact swapping segment | 1fia
 | (c) | Global min. close to center; local min./max. around ends | Compact domain with long uncompact swapping segment | 1obp, 1puc
 | (d) | Global min. around middle; global max. around both ends | No compact domain in the monomer; two chains intertwined | 1cdc, 1rfb
2-state dimers (Table 4) (Figs. 4, 5) | (a) | Global max. around middle; global min. around both ends | Flat large interface; no swapping segment | 1rpo
 | (b) | No deep min. or max.; the profile is flat | Chains are short; two chains intertwined | 1cta, 2zta
 | (c) | Global min. around middle; local min./max. around ends | Uncompact segment swaps to a relatively compact domain | 1hhp, 3wrp
 | (d) | Local min. around middle; global min. close to an end | Uncompact swapping segment in the middle of a sequence | 2gvb
3-state dimers (Table 5) (Figs. 6, 7) | (a) | Global max. around middle; global min. around both ends | Flat small interface; no swapping segment | 1xso
 | (b) | Global min. close to an end; local min./max. around ends | Compact domain with short uncompact swapping segment | 1tar
 | (c) | Local min. around middle; global min. around both ends | Inter-chain domain packing; no swapping segment | 1glq

^a The features of the compactness profiles and related structures in the different groups and types of dimers. min.: minimum; max.: maximum.

The numbering of (a), (b), ... for different types in Table 1 is consistent with the numbering of (a), (b), 
etc., marked in the figures of the compactness profiles and of the structures (Figs. 2, 3, 6, 7). 

Fig. 2 Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t for the 
domain swapping dimers. A: The global minimum is close to the middle and the two ends (t = 0, 1) are 
around local minima. B: The global minimum is close to one end (t < 0.2 or t > 0.8). C: The global 
minimum is close to the middle; one end is around a local minimum and the other end is around a 
local maximum. D: The global minimum is close to the middle and both ends are around global maxima. 
The protein names of marked PDB codes are listed in Table 2. 

Fig. 3 Structural examples of domain swapping dimers for Figure 2. The two chains are differentiated 
utilizing blue and red colors. A: Diphtheria toxin (1ddt). Left graph: residues 1-379 in the thick blue line 
and residues 380-535 in the thick red line; right graph: residues 1-187 in the thick blue line and residues 
200-535 in the thick red line. Residues 188-199 are not shown, since they are unavailable in the crystal 
structure. B: BS-RNase (1bsr) with residues 22-124 in the thick blue line and residues 1-21 in the thick 
red line. C: Odorant-binding protein (1obp) with residues 2-121 in the thick blue line and residues 
122-159 in the thick red line. D: N-terminal domain CD2 (1cdc) with residues 4-45 in the thick blue line 
and residues 46-99 in the thick red line. The thick lines and thin lines show the two selected symmetric 
units which have the lowest C value in the compactness profile, except for the right graph of (A), which 
shows a local minimum of C. This picture and Figures 5 and 7 were generated by the program QUANTA 
(Molecular Simulations, 1994). 






Fig. 4 Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t for the 
2-state dimers. A: Both ends (t = 0, 1) are around the degenerated global minima. B: C does not vary 
significantly along t. C: Domain swapping dimers. D: Both ends (t = 0, 1) are around the degenerated 
global minima, while there is a deep local minimum in the middle. The protein names of marked PDB 
codes are listed in Table 3. 

Fig. 5 Structural examples of 2-state dimers for Figure 4. The two chains are differentiated by blue and 
red. A: Two monomers of repressor of primer (1rpo), shown in blue and red, respectively. B: Two 
monomers of troponin C site III (1cta), shown in blue and red, respectively. C: Trp aporepressor (3wrp) 
with residues 44-108 in the thick blue line and residues 8-43 in the thick red line. The two units shown in 
the thick line and in the thin line correspond to the cutting with the lowest C value in the compactness 
profile. D: Gene V protein (2gvb). Left graph: residues 1-61 in the thick blue line and residues 62-87 in 
the thick red line. The corresponding C value of the cutting is 2.196, which is the lowest along the 
compactness profile. Right graph: residues 1-64 and 80-87 in the thick blue line and residues 62-79 in the 
thick red line. The cutting yields a C value of 2.053, which is the lowest in all possible cuttings which 
divide the dimer into two symmetric blocks. 

Fig. 6 Compactness profile, i.e., coefficient of compactness (C) as a function of cutting ratio t for the 
3-state dimers. A: Both ends (t = 0, 1) are around the degenerated global minima. B: Dimers with small 
segment swapping. C: Both ends (t = 0, 1) are around the degenerated global minima with a deep local 
minimum around the middle. The protein names of marked PDB codes are listed in Table 5. 

Fig. 7 Structural examples of 3-state dimers for Figure 6. The two chains are differentiated by blue and 
red. A: The two monomers of superoxide dismutase (1xso), shown in blue and red, respectively. B: 
Aspartate aminotransferase (1tar) with residues 3-13 in the thick blue line and residues 14-410 in the 
thick red line. C: Glutathione S-transferase (1glq) with residues 79-209 in the thick blue line and residues 
1-78 in the thick red line. In (B) and (C), the thick lines and thin lines show the two selected symmetric 
units which have the lowest C value in the compactness profile. 

Figure 2 shows the compactness profiles of domain swapping dimers. Assume t = t_m is the cutting 
position obtaining the global minimum of C, i.e., t_m reflects the most compact unit among all possible 
units derived from the progressive cutting of the dimer. Hence, t_m identifies the exact residue range of the 
swapping segment. The results are listed in Table 2. 
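
Locating t_m in a computed profile is a simple minimum search, sketched below in Python. The function names are hypothetical, the profile is assumed to be a list of (t, C) pairs ordered from t = 0 to t = 1, and the `tolerance` parameter is an illustrative assumption rather than a threshold taken from this work. The interior-minimum test is only a rough reading of the criterion used in the text, where a dimer is predicted to swap when the global minimum does not sit at an end of the profile.

```python
from typing import List, Tuple

Profile = List[Tuple[float, float]]  # (cutting ratio t, coefficient of compactness C)


def global_minimum(profile: Profile) -> Tuple[float, float]:
    """Return (t_m, C_min), the cut with the lowest coefficient of compactness."""
    return min(profile, key=lambda point: point[1])


def has_interior_minimum(profile: Profile, tolerance: float = 0.0) -> bool:
    """True if the global minimum lies strictly between the two ends of the
    profile and is lower than both end values by more than `tolerance`."""
    t_m, c_min = global_minimum(profile)
    c_start, c_end = profile[0][1], profile[-1][1]
    return 0.0 < t_m < 1.0 and c_min < min(c_start, c_end) - tolerance


# Toy numbers for illustration only; they are not measured values.
toy_profile = [(0.0, 2.4), (0.25, 2.6), (0.5, 2.1), (0.75, 2.5), (1.0, 2.4)]
t_m, c_min = global_minimum(toy_profile)
```

With N residues per chain, the cut at t_m corresponds to residue position i = t_m * N, so one unit contains residues 1 to i of one chain and residues i + 1 to N of the other.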

Table 2. Domain swapping dimers^a

Interface | Molecule | Resol. (Å) | Size (residues) | Swapped segment (residues)
1ddt (sym) | Diphtheria toxin | 2.0 | 535 | 380-5…
1bsrAB | BS-RNase | 1.9 | 124 | 1-2…
1cdcAB | N-terminal domain CD2 | 2.0 | 99 | 46-9…
1spcAB | Spectrin | 1.8 | 107 | 7…
2bb2 (sym) | βB2 crystallin | 2.1 | 181 | -2-8…
1hulAB | Interleukin-5 | 2.4 | 108 | 89-1…
1ilk (sym) | Interleukin-10 | 1.8 | 151 | 117-1…
1rfbAB | Interferon γ | 3.0 | 119 | 82-1…
1obpAB | Odorant-binding protein | 2.0 | 159 | 122-1…
1puc (sym) | Protein suc1 | 1.95 | 105 | 89-1…
1humAB | HMIP-1β | (NMR) | 69 | 1-8
1fiaAB | FIS protein | 2.0 | 98 | 56-9…

^a Domain swapping dimers employed in the analysis. The table describes the PDB code with the chain names, name … 
the structure, number of amino acids in a chain, the swapping segment predicted by the compactness profile and the … 
Interface column indicates that the dimer was generated from a single chain in the PDB by using the symmetry operation (… 
in the "Resolution" column indicates the dimer is from an NMR structure. The residue numbers in the swapping segment … 
dimer's PDB code. The stability indicates whether a dimer can maintain its structure under normal physiological conditions … 
exist under some special conditions. For example, diphtheria toxin (1ddt) is in a dimeric form in the crystal structure but it i… 
et al., 1994). The first seven cases were taken from the reference (Bennett et al., 1995). The next three cases are from … 
(Bennett et al., 1994), 1obp (Tegoni et al., 1996), 1puc (Bourne et al., 1995). The last two cases, 1hum and 1fia, wer… 
hydrophobic folding units (Tsai & Nussinov, 1997b). 

The landscape of the profile can be roughly classified into four types. Type (a), shown in Figure 2A, 
describes the case where the global minimum is around the middle and the two ends (t = 0, 1) are around 
local minima. Diphtheria toxin (1ddt), depicted in the left graph in Figure 3A, is an example. The thick 
line indicates the cutting segment corresponding to t_m. In this cutting, a segment (residues 380-535, 
shown in the thick red line) of one chain swaps to integrate with its complementary portion (residues 1-379 
in the thick blue line) of the other chain. There is a clear barrier between either of the ends and the global 
minimum in the compactness profile. The compactness is reduced as the cutting units start to deviate 
from the single chain, until t approaches t_m. Hence, the conformation of each chain alone is relatively 
compact. Since a compact structure in the solvent is energetically more favorable than a non-compact 
segment, the monomeric conformation observed in the dimer is in a meta-stable state prior to 
dimerization. Given the compactness both at t = t_m and at t = 0, 1, each chain can be divided into two 
compact blocks, one with residues 1-379 and one with residues 380-535, as shown by the thick and thin 
lines, respectively, within a chain. There is a local minimum at t = 0.35, with the conformation shown in 
the right graph of Figure 3A. The cutting unit is formed by residues 1-187 (thick blue line) of one chain, 
and residues 200-535 (thick red line) of the second chain. This local minimum demonstrates that the 
segment 1-379 can be further divided into two compact domains: 1-187 and 200-379 (there are no 
coordinates available for residues 188-199 in the PDB). 

Types (b) and (c) of domain swapping dimers, shown in Figures 2B and 2C, are similar. In both cases, there is a local 
minimum at one end of the profile (t = 0 or t = 1) and a local maximum at the other end. In (b) the global 
minimum is close to one end (t < 0.2 or t > 0.8), while in (c), the global minimum is around the middle. 
However, the difference between (b) and (c) is not clear cut. Figure 3B shows the conformation of 
BS-RNase (1bsr), an example of the cutting in Figure 2B. The unit at t = t_m contains residues 22-124 
(thick blue line) of one chain and residues 1-21 (thick red line) of the other. Figure 3C depicts an example 
taken from Figure 2C, i.e., the cutting of odorant-binding protein (1obp) at t = t_m. It represents the 
conformation with residues 2-121 (thick blue line) from one chain, and residues 122-159 (thick red line) 
from the other chain. Both type (b) and (c) dimers have compact domains starting from a sequence 
terminus, i.e., residues 22-124 in 1bsr and 2-121 in 1obp. The other portion of the chain (residues 1-21 of 
1bsr or residues 122-159 of 1obp) is swapping to the compact domain. This portion is not compact; 
therefore, its structure in the dimer is unstable before dimerization. The swapping segments of (b) and (c) 
differ in their contribution to the stability of the dimer. The stability of the dimer is mostly contributed by 
the segment swapping in (c). The swapping segments in (b) contribute much less and may only help 
further strengthen the dimers, in particular in the case of BS-RNase (Piccoli et al., 1992; D'Alessio, 
1995). 

Type (d) of domain swapping dimers, shown in Figure 2D, has the global minimum of C around the middle 
and the degenerated global maxima of C at both ends. As an example in Figure 3D, the cutting at t = t_m in 
the N-terminal domain CD2 (1cdc) comprises residues 4-45 (thick blue line) of one chain, and residues 
46-99 (thick red line) of the other. The two chains in the dimer intertwine with each other. The 
conformation of the whole chain or any segment within the chain is not compact, and hence, the 
monomer is unstable by itself prior to dimerization. 

Figure 4 gives the compactness profiles of dimers shown by experiments to have 2-state kinetics (listed in 
Table 3). 



Table 3. 2-state protein dimers^a

87 
1hhp (sym) 
HIV-1 protease 
99 
1lfb (sym) 
LFB1 transcription factor 
99 
De Francesco et … 
36 
Monera et al., 19… 
53 
2ztaAB 
Leucine zipper 

^a The 2-state dimers. The table describes the PDB code with the chain names, name of the molecule, the resolution of the … 
number of amino acids in a single chain, and the reference which shows the dimerization is 2-state. 

The landscape of the profile can be roughly divided into four types. In type (a), both ends of the 
compactness profile are around the degenerated global minima of C. The repressor of primer (1rpo), 
whose structure is shown in Figure 5A, provides an example. The dimer interface is flat, and there is no 
swapping segment between the two chains. In type (b), C is around 2.0 and the fluctuation is relatively 
small along t (less than 10%). As demonstrated in Figure 5B for troponin C site III (1cta), the dimers are 
very small. Each chain is relatively compact and a large portion of either chain's surface area is buried in 
the interface. The compactness profile of type (c) is similar to that of domain swapping dimers. The 
compactness profile of HIV-1 protease (1hhp) matches type (b) of domain swapping, which involves only 
a short segment exchange across the two chains. Trp aporepressor (3wrp) matches type (c) of the domain 
swapping dimers. As shown in Figure 5C, residues 44-108 (thick blue line) and residues 8-43 (thick red 
line) intertwine with each other. However, residues 44-108 form a relatively compact segment. In type 
(d), both ends are around the degenerated global minimum, while there is a deep local minimum around 
the middle. The left graph of Figure 5D shows the conformation of gene V protein (2gvb) at t = t_m. The 
structure consists of residues 1-61 from one chain (thick blue line) and residues 62-87 from the other 
chain (thick red line). The value of C at t = t_m is 2.196. Since there is no global minimum in the middle, it 
is not predicted as a domain swapping dimer by the compactness profile. However, one can see that there 
is a swapping segment in the middle of the sequence. We carried out a search for the minimum C around 
residues 62-87 in all possible symmetric cuttings, i.e., without the limitation that a chain can only be 
divided into two contiguous segments. The search yields a compactness of 2.053, which is lower than C 
at t = t_m in the previous cutting. The conformation of the new symmetric unit is composed of residues 
1-64 and 80-87 (both in the thick blue lines) from one chain, and residues 62-79 (thick red line) from the 
other chain, as shown in the right graph of Figure 5D. In this particular case, the compactness profile 
failed to predict 2gvb as a domain swapping dimer, since the swapping segment does not start from a 
terminus of the chain. 
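
A search over cuttings in which the exchanged segment lies inside the sequence can be sketched as follows. This is a generic enumeration written for illustration, not the search procedure actually used for 2gvb: the function names and the `min_length` parameter are hypothetical, and the compactness function is again supplied by the caller.

```python
from typing import Callable, Iterator, List, Tuple

Unit = Tuple[List[int], List[int]]  # (residues from chain A, residues from chain B)


def internal_segment_cuts(
    n_residues: int, min_length: int = 1
) -> Iterator[Tuple[int, int, Unit]]:
    """Yield (a, b, unit) for every internal segment a..b (1 < a <= b < N).

    In each unit, residues a..b come from chain B and the two flanking
    segments 1..a-1 and b+1..N come from chain A.
    """
    for a in range(2, n_residues):                       # never starts at the N-terminus
        for b in range(a + min_length - 1, n_residues):  # never ends at the C-terminus
            middle = list(range(a, b + 1))
            flanks = list(range(1, a)) + list(range(b + 1, n_residues + 1))
            yield a, b, (flanks, middle)


def best_internal_cut(
    n_residues: int, compactness: Callable[[Unit], float], min_length: int = 1
) -> Tuple[int, int, float]:
    """Return (a, b, C) for the internal segment giving the lowest C."""
    a, b, unit = min(
        internal_segment_cuts(n_residues, min_length),
        key=lambda cut: compactness(cut[2]),
    )
    return a, b, compactness(unit)
```

The number of candidate segments grows quadratically with chain length, which is one reason to restrict the search to a window of the sequence, as was done around residues 62-87.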

Figure 6 presents the compactness profiles of dimers showing 3-state kinetics in experiments (see Table 
5). 



Table 5. 3-state protein dimers^a

^a … number of amino acids in a single chain, and the reference which shows the dimerization is 3-state. 

There are roughly three types. In type (a), both ends are around the degenerated global minima, as shown 
in the example of superoxide dismutase (1xso) in Figure 7A. The dimer interface is flat, with no 
segment exchanged between the two chains. It is similar to type (a) of the 2-state. However, the interface 
of the 3-state dimer occupies a much smaller portion of the monomer's surface area than that of the 
2-state. In type (b), there is a small segment swapping in the dimer. This belongs to type (b) of the 
domain swapping dimers. As shown in Figure 7B, aspartate aminotransferase (1tar) swaps residues 3-13 
(thick blue line) to the major portion of the monomer, i.e., residues 14-410 (thick red line). In type (c), 
both ends are around the degenerated global minima with a deep local minimum in the middle. The profile 
is similar to type (d) of the 2-state. However, structurally they do not share a common feature. Figure 7C 
shows the conformation of glutathione S-transferase (1glq) at t = t_m. Residues 79-209 (thick blue line) 
and residues 1-78 (thick red line) are two compact domains of the monomer. They do not intertwine. The 
local minimum originates from the packing of the two domains between the two chains rather than within 
a monomer. 

Properties of dimer interfaces 

We evaluated the hydrogen bonds across dimer interfaces for the 2-state, 3-state, and the domain 
swapping dimers, as shown in Table 4 . 



Table 4. Dimer properties. DeltaMSA: total buried surface area of the dimer interface; p: the ratio
between DeltaMSA and the total surface area of the two unbound monomers in the dimer; hbond: number
of hydrogen bonds across the dimer interface.

There is no significant signature in this aspect for the three groups of dimers. We calculated the total
buried molecular surface area (MSA) across the dimer interface, i.e., DeltaMSA. It is obtained as the
difference between the sum of the surface areas of the two separate chains, i.e., MSA_1 and MSA_2, and
the surface area of the dimer (MSA_1 and MSA_2 should be identical in a symmetric homodimer, but they
may differ slightly in the actual PDB entry because of crystal packing and errors in the structure). The
ratio p between the interfacial MSA of a monomer and its total MSA is derived as



$$ p = \frac{\Delta\mathrm{MSA}}{\mathrm{MSA}_1 + \mathrm{MSA}_2} \qquad (2) $$
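
As a quick numerical illustration of Eq. 2, the following minimal sketch computes DeltaMSA and p from three molecular surface areas: those of the two separated chains and that of the dimer. The function name and the example areas are hypothetical and chosen only for illustration; they are not values from the tables of this paper.

```python
def buried_area_and_ratio(msa_chain1: float, msa_chain2: float, msa_dimer: float):
    """Return (delta_msa, p) for a dimer.

    delta_msa: area buried on dimer formation, i.e. the summed area of the
               two separated chains minus the area of the dimer.
    p:         delta_msa divided by the summed area of the separated chains (Eq. 2).
    """
    total_unbound = msa_chain1 + msa_chain2
    if total_unbound <= 0:
        raise ValueError("surface areas must be positive")
    delta_msa = total_unbound - msa_dimer
    return delta_msa, delta_msa / total_unbound


# Hypothetical areas in square angstroms (illustrative only, not from the paper).
delta_msa, p = buried_area_and_ratio(6000.0, 6000.0, 10800.0)
print(delta_msa, round(p, 3))
```

With these made-up inputs the buried area is one tenth of the total unbound area, which is the order of magnitude the text reports for 3-state dimers.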



DeltaMSA and p are listed in Table 4. Figure 8 plots the relationship between DeltaMSA and p.

Fig. 8. The relationship between DeltaMSA and p for the 2-state dimers (solid line), the 3-state dimers
(broken line), and the domain swapping dimers marked by PDB codes (dots and crosses). DeltaMSA is
the total interfacial buried surface area of the dimer; p is the ratio between DeltaMSA and the total
surface area of the two unbound monomers in the dimer. Type (d) domain swapping dimers are
indicated by crosses, and the other types by dots. The protein names of the marked PDB codes can be
found in Table 2.



3-state dimers (dashed line) have a small p value, around 0.1, and p does not increase significantly with
increasing DeltaMSA. The p of the 2-state dimers (solid line) increases dramatically with increasing
DeltaMSA. This indicates that the interface of a 3-state dimer involves only a small portion of the surface
area of the monomer, as shown in Figure 7, while the interface of a 2-state dimer often comprises a large
portion of the surface area of each chain, as shown in Figure 5. The p values of the domain swapping
dimers (shown by dots and crosses) scatter between those of the 2-state and the 3-state dimers. Type (d)
domain swapping dimers, marked by crosses, show 2-state behavior, except for interferon (1rfb), whose
compactness profile is shallow compared to the other four cases of this type (see Figure 2D). This again
shows that most type (d) domain swapping dimers follow a 2-state profile.

Discussion 

In this section, we discuss the attributes of the compactness profile method and the characteristics of 
domain swapping. We further address the kinetics of dimerization and the evolution of dimers based on 
our computational analysis. 

Implication of the compactness profile 

The compactness profile is a useful tool to probe the geometric organization of the two chains in dimers.
It provides not only a single compactness coefficient for the whole chain, but also a description of how
the two chains of a dimer pack together. A special advantage of the method is that the compactness
coefficients for the different cuttings of a dimer are calculated for units with identical amino acid
composition. Hence, a comparison between the compactness coefficients of a dimer is related
solely to the geometry (shape) of the units, rather than to their physical or chemical properties. If a dimer
is known to have domain swapping, one can find the global minimum of the compactness profile to
pinpoint accurately the domain swapping segment. In most cases, the compactness profile can be used to
predict whether a dimer is domain swapped, and what type of swapping it is, as summarized in Table 1. It
may fail to identify the case where the swapped segment lies in the middle of a chain (see Fig. 5D).
However, this is very rare in dimers. The compactness profile can also identify which segment forms
a compact domain in a chain, and it can be similarly applied to other oligomers.
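
The cutting procedure behind the profile can be sketched in a few lines. The sketch below assumes one plausible reading of the method: for a homodimer with N residues per chain, the unit for cut position k joins residues 1..k of one chain with residues k+1..N of the other. The chain labels, the function names, and the `compactness_of_unit` callback (a stand-in for a surface area and volume calculation) are all hypothetical.

```python
from typing import Callable, List, Tuple

Residue = Tuple[str, int]  # (chain label, residue number); labels "A"/"B" are illustrative


def unit_for_cut(n_residues: int, cut: int) -> List[Residue]:
    """Residues 1..cut from chain A joined with residues cut+1..n_residues from chain B."""
    if not 0 <= cut <= n_residues:
        raise ValueError("cut must lie between 0 and n_residues")
    from_a = [("A", i) for i in range(1, cut + 1)]
    from_b = [("B", i) for i in range(cut + 1, n_residues + 1)]
    return from_a + from_b


def compactness_profile(
    n_residues: int,
    compactness_of_unit: Callable[[List[Residue]], float],
) -> List[float]:
    """Evaluate the user-supplied compactness function on the unit of every cut."""
    return [
        compactness_of_unit(unit_for_cut(n_residues, cut))
        for cut in range(n_residues + 1)
    ]


# Each unit covers every sequence position exactly once, so in a homodimer
# all units share the same amino acid composition.
unit = unit_for_cut(5, 2)
positions = sorted(number for _, number in unit)
print(unit, positions)
```

Under this reading, the two end points of the profile (cut 0 and cut N) correspond to the intact monomers, and every other point to a chimeric unit built from both chains.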

Domain swapping 

The information obtained from the compactness profile allows us to revisit the definition of domain
swapping. Many dimers are referred to as domain swapping dimers in the literature based on their
structures, as shown in Table 2. However, among the four types of domain swapping dimers shown in
Figures 2 and 3, only type (a) fits the "classical" meaning of domain swapping. There, the parts
swapped between the two chains are stable domains. In types (b) and (c), there is a stable domain in each
chain. However, the swapped portions are not compact, and do not fit the definition of a protein domain
in the conventional sense. Dimers of these types may be dubbed "segment swapping dimers" rather than
domain swapping dimers. Type (d) of domain swapping is similar to types (a) and (b) of the 2-state
dimers. It cannot form a compact unit along any portion of the chain. Hence, we suggest that dimers of
this type be classified as 2-state, rather than domain swapping. The compactness profile provides a
computational tool for determining whether a dimer has domain or segment swapping. If the compactness
profile has a global minimum in the middle and local minima at both ends, the dimer is predicted to be a
domain swapping dimer; if the profile has a global minimum in the middle and only one end has a local
minimum, the dimer is predicted to be a segment swapping dimer. Lastly, if there is a local minimum in
the middle and two global minima at both ends, as shown in Figure 4D, it is possible that a segment in the
middle of one chain swaps to form a compact unit with two complementary segments of its sister chain.

Dimer kinetics 

The difference between 2-state and 3-state dimers can be revealed in several ways. First, it can be
detected directly by kinetics, i.e., whether a monomer has a stable structure on its own in the solvent
prior to dimerization. Second, as our earlier work shows, the differences between the two groups of
dimers are also found in energetics, in thermodynamics, and in structures. Hydrophobic effects are
generally more important to 2-state complexes than to 3-state ones (Tsai et al., 1997; Xu et al., 1997). The
interfacial motif of a 2-state dimer is far more likely than that of a 3-state dimer to have a good structural
alignment with a monomer in the PDB (Tsai et al., 1997). A 2-state complex is more likely to reach its
global free energy minimum than a 3-state one (Xu et al., 1997). 2-state binding is similar to the
folding of a single-domain monomer, a process which typically fits 2-state kinetics as well (Zwanzig,
1997). In both cases, the formation of the final state from the initial state does not depend on a unique
reaction pathway or a stable intermediate. The lack of linkage between the two chains in a 2-state dimer
is not critical. This is well illustrated by barnase, where an assembly of the polypeptide fragments of the
monomer forms a native-like conformation (Kippen et al., 1994). Consistently, cutting a chain in the
dimeric trp repressor still keeps the native dimer structure basically intact (Tasayco & Carey, 1992). On
the other hand, a 3-state dimer depends substantially on the kinetic pathway from the denatured monomers
to the complex. In particular, it has to go through the intermediate, i.e., the stable monomer state.



Our study sheds some light on the classification of dimers. As seen in Figure 8 , many domain swapping 
dimers are in between 2-state and 3-state. This suggests that the kinetics of their dimerizations is not 
2-state. On the other hand, their intermediates are not as stable as the intermediates in the 3-state 
dimerizations. As shown by the compactness profiles, in the domain swapping dimers of types (b) and (c), 
only a portion of the chain is folded into a compact unit. The remainder of the chain is not compact in the 
dimer structure. This suggests that the intermediates of these dimers are most likely monomers having 
some, relatively stable, structures for the compact domains but no unique structures for the other 
portions. That is, in the intermediate state, the compact domain of the chain has substantially fewer
structural fluctuations than the non-compact portion. From this standpoint, the distinction between
2-state and 3-state is not clear-cut for dimers. There is a broad spectrum of dimerization kinetics: from
no intermediate at one end to rigid binding at the other.

Evolution of dimers 

The kinetics of dimerization can be revealed in the structures of the dimers. Based on the assumption that
the kinetic pathway reflects the evolutionary path (D'Alessio, 1995), we may propose evolutionary
pathways from monomer to dimer. Although the kinetics of dimerization takes place on a time scale of
seconds while the evolution of a dimer may have taken millions of years, both processes share a similarity
in terms of energetic stability. The kinetics of dimerization is along the pathway which is optimal among 
all possible paths governed by the reaction free energy function. The evolution of the dimer has been 
along the pathway which optimized the biological function of the protein, typically accompanied by the 
optimization of its energetic stability. Hence, the kinetic pathway of dimerization is likely to follow the 
same course trodden by evolution. As discussed above, dimerizations have a spectrum of kinetics, from 
the 2-state with no intermediate, to the rigid binding where the intermediate monomer has the same 
conformation as that in a dimer. In between, the intermediate may have some structure but different from 
the one observed in the dimeric configuration. For example, some intermediates have molten globular 
structures or have stable domains connected by unstable loops. The diversity of the kinetics indicates that 
evolution may have multiple pathways. According to the types of the kinetics, we suggest three possible 
pathways. 

The first evolutionary pathway of dimerization follows the mechanism which governs the evolution of 
single-domain monomers, without going through an ancestor monomer. Monomeric proteins have 
evolved through gene mutation, deletion, and fusion to become foldable and functional proteins. 
Although a dimer contains two chains, the linkage between the chains is not important in many cases 
(Tasayco & Carey, 1992; Kippen et al., 1994). The motifs of some dimer interfaces align perfectly with
the motifs of some protein monomers. For example, the repressor of primer, a 2-state type (a) dimer
shown in Figure 5A, matches well a four-helix bundle motif in the 1rpr monomer (Tsai et al., 1997). The
tethered dimer, which consists of two 99-amino acid HIV-PR subunits linked together by a pentapeptide,
maintains basically the same structure as the natural HIV-PR dimer (Cheng et al., 1990). We suggest that
during evolution, some dimers have evolved through gene mutation, deletion, and fusion, similarly to the 
monomer evolution. Here, the protein chain was optimized directly for the functional dimer, rather than 
for any intermediate. The dimers did not have stable monomers in their evolutionary pathways. Types (a) 
and (b) of the 2-state dimers, and type (d) of the domain swapping dimers, may have evolved through 
such an evolutionary mechanism, since their dimerization kinetics resembles the folding kinetics of 
monomers. 

The second evolutionary pathway to dimerization is through a stable monomer as an intermediate 
followed by mutations of surface residues in the monomer. Types (a) and (c) of the 3-state dimers fit into
this category. The ancestors of these dimers were structured monomers. Since the monomers were stable
in the solvent, mutations of surface residues could effectively make the surface and energetics
complementary between two monomers to form a dimer. The interfaces of such dimers are likely to be
small, and key residues for the binding are few (Novotny et al., 1989; Cunningham & Wells, 1993).
Hence, not many mutations are needed to evolve a dimer from a monomer in this case. 

The third evolutionary pathway is through a domain swapping mechanism, as suggested by Eisenberg and
colleagues (Bennett et al., 1994; Bennett et al., 1995). This includes types (a) and (c) of the domain
swapping dimers, and types (c) and (d) of the 2-state dimers. These dimers contain at least one compact
domain in a chain. This domain is stable and forms a unique structure in the solvent in the monomeric
state. Although no stable intermediate is detected in kinetics for types (c) and (d) of the 2-state dimers, it
is likely that under certain solvent conditions, or through mutations of a few residues, the compact units
in these dimers may be stable in the monomeric form. From the energetics point of view, the swapping
between the compact domain and the remainder of the chain, whether stable or unstable, helps
stabilize the recognition between the two monomers. From the evolutionary standpoint, mutations and
deletions of residues may have facilitated the conversion of the protein to a form that is more stable in
the domain swapping dimer than in the monomeric form (Bennett et al., 1995). This type of evolutionary
process has been nicely demonstrated in staphylococcal nuclease, where deleting six residues converts
the native monomer into an engineered dimer (Green et al., 1995).

It is also likely that in some cases, two evolutionary pathways acted on the same dimer at different
stages. This was probably the case for type (b) of the domain swapping dimers and type (b) of the 3-state
dimers. Unlike type (c) domain swapping dimers, which are mostly stabilized by the exchange of the
swapping segments, here the exchanged segments are small. The small swapping segments help
strengthen the dimer, but swapping may not have been its initial evolutionary pathway.
This has been illustrated in the case of BS-RNase (Piccoli et al., 1995). These dimers may have initially
evolved through the second evolutionary pathway and were further optimized through the domain
swapping mechanism at a later stage of their evolution. This two-step evolution is consistent with the
kinetics of BS-RNase, in which the dimer conformation with the swapped segments occurs only after the
dimer without swapping is formed (Piccoli et al., 1992).

In summary, the domain swapping mechanism is one, but not the only possible evolutionary pathway of 
protein dimerization. Some domain swapping dimers, which are referred to as such in the literature, 
actually follow the evolutionary mechanism observed for most 2-state dimers. Such evolution does not go 
through a stable monomeric ancestor. At the other end of the spectrum, most 3-state dimers have evolved
through stable monomers followed by mutations of their surface residues. Some such 3-state dimers may
have also experienced domain swapping at a later stage of evolution. We further note that the
evolutionary mechanisms proposed and discussed here for dimers can be applied to other oligomers as
well.

Materials and methods 
Data set selection 

We collected three groups of dimers from the literature. The structures of the dimers are known from
crystallography or NMR. The first group includes those referred to as domain swapping
dimers in the literature (see Table 2). The second and third groups include the 2-state and 3-state dimers,
as determined in kinetic experiments. These are listed in Tables 3 and 5.

Calculation of the compactness profile

The coefficient of compactness of a structure is defined as (Zehfus & Rose, 1986)

$$ C = \frac{\text{MSA of the structure}}{\text{MSA of a sphere of equal volume}} \qquad (3) $$

C can be simplified to

$$ C = \frac{\mathrm{MSA}}{(36\pi)^{1/3}\, V^{2/3}} \qquad (4) $$

MSA and V are the molecular surface area and the volume of the structure, respectively. We employed
the molecular surface area (Richards, 1977) rather than the solvent accessible surface area (Lee &
Richards, 1971), which was used by Zehfus and Rose (1986), since the former is a more sensitive
measurement of the molecular shape.
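
The coefficient is straightforward to evaluate once a surface area and a volume are available. The minimal sketch below assumes the two quantities have already been computed by a surface program. It uses the fact that a sphere of volume V has surface area (36 pi)^(1/3) V^(2/3), and checks the definition on two idealized shapes. The function name is hypothetical.

```python
import math


def compactness_coefficient(msa: float, volume: float) -> float:
    """Ratio of a structure's surface area to that of a sphere of equal volume."""
    if msa <= 0 or volume <= 0:
        raise ValueError("msa and volume must be positive")
    sphere_area = (36.0 * math.pi) ** (1.0 / 3.0) * volume ** (2.0 / 3.0)
    return msa / sphere_area


# A sphere is the most compact shape, so its coefficient is 1 by construction.
radius = 10.0
sphere = compactness_coefficient(4.0 * math.pi * radius**2,
                                 4.0 / 3.0 * math.pi * radius**3)

# A cube exposes more surface than a sphere of the same volume.
side = 10.0
cube = compactness_coefficient(6.0 * side**2, side**3)
print(round(sphere, 6), round(cube, 3))
```

Lower values therefore indicate more compact units, which is why minima of the profile mark compact combinations of segments.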

We have employed the united atom model, which excludes hydrogen atoms, for the MSA calculation. The
areas and volumes were calculated by the molecular surface (MS) program (Connolly, 1993). The
coordinates were obtained from the Brookhaven Protein Data Bank (PDB) (Bernstein et al., 1977). The
atomic radii were taken from the CHARMM parameters (Brooks et al., 1983). The solvent probe has a
radius of 1.4 Å.

Calculation of hydrogen bonds 

Hydrogen bonds across the dimer interfaces were analyzed by HBPLUS (McDonald et al., 1993;
McDonald & Thornton, 1994). The program determines the positions of missing hydrogens in the PDB
entry and checks each donor-acceptor pair against the following geometric criteria: the
maximum distances are 3.9 Å between donor and acceptor and 2.5 Å between acceptor and hydrogen;
the minimum angles are 90.0 degrees for the donor-hydrogen-acceptor angle, for the
donor-acceptor-acceptor antecedent angle, and for the hydrogen-acceptor-acceptor antecedent angle
(Baker & Hubbard, 1984). Amino-aromatic hydrogen bonds are not taken into account in our analysis.
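
The geometric criteria listed above can be expressed as a small predicate. The sketch below is not HBPLUS and does not place missing hydrogens; it only tests the two distance cutoffs and three angle cutoffs quoted in the text for four atoms whose Cartesian coordinates are already known. All function and parameter names are hypothetical.

```python
import math
from typing import Sequence

Point = Sequence[float]  # Cartesian coordinates in angstroms


def angle_deg(a: Point, vertex: Point, b: Point) -> float:
    """Angle a-vertex-b in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("coincident atoms")
    cosine = sum(x * y for x, y in zip(v1, v2)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def is_hydrogen_bond(donor: Point, hydrogen: Point, acceptor: Point,
                     antecedent: Point, max_da: float = 3.9,
                     max_ha: float = 2.5, min_angle: float = 90.0) -> bool:
    """Apply the distance and angle cutoffs quoted in the text."""
    return (
        math.dist(donor, acceptor) <= max_da
        and math.dist(hydrogen, acceptor) <= max_ha
        and angle_deg(donor, hydrogen, acceptor) >= min_angle       # D-H-A
        and angle_deg(donor, acceptor, antecedent) >= min_angle     # D-A-AA
        and angle_deg(hydrogen, acceptor, antecedent) >= min_angle  # H-A-AA
    )


# Idealized, made-up coordinates: a nearly linear bond, then one that is too long.
close = is_hydrogen_bond((0, 0, 0), (1, 0, 0), (2.9, 0, 0), (4.1, 0.5, 0))
far = is_hydrogen_bond((0, 0, 0), (1, 0, 0), (5.0, 0, 0), (6.2, 0.5, 0))
print(close, far)
```

Counting interface hydrogen bonds would then amount to applying such a test to donor and acceptor atoms that belong to different chains.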